Search results for "information extraction"
showing 10 items of 25 documents
Recent Advances in Techniques for Hyperspectral Image Processing
2009
Imaging spectroscopy, also known as hyperspectral imaging, has been transformed in less than thirty years from being a sparse research tool into a commodity product available to a broad user community. Currently, there is a need for standardized data processing techniques able to take into account the special properties of hyperspectral data. In this paper, we provide a seminal view on recent advances in techniques for hyperspectral image processing. Our main focus is on the design of techniques able to deal with the high-dimensional nature of the data, and to integrate the spatial and spectral information. Performance of the discussed techniques is evaluated in …
BIOfid dataset: publishing a German gold standard for named entity recognition in historical biodiversity literature
2019
The Specialized Information Service Biodiversity Research (BIOfid) has been launched to mobilize valuable biological data from printed literature hidden in German libraries for over the past 250 years. In this project, we annotate German texts converted by OCR from historical scientific literature on the biodiversity of plants, birds, moths and butterflies. Our work enables the automatic extraction of biological information previously buried in the mass of papers and volumes. For this purpose, we generated training data for the tasks of Named Entity Recognition (NER) and Taxa Recognition (TR) in biological documents. We use this data to train a number of leading machine learning tools and c…
Human-in-the-Loop Conversation Agent for Customer Service
2020
This paper describes a prototype system for partial automation of customer service operations of a mobile telecommunications operator with a human-in-the-loop conversational agent. The agent consists of an intent detection system for identifying the types of customer requests that it can handle appropriately, a slot-filling information extraction system that integrates with the customer service database for a rule-based treatment of the common scenarios, and a template-based language generation system that builds response candidates that can be approved or amended by customer service operators. The main focus of this paper is on the system architecture and machine learning system structure …
ImageRover: A Content-Based Image Browser for the World Wide Web
1997
ImageRover is a search-by-image-content navigation tool for the World Wide Web (WWW). To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, computing the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search performance is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that se…
Creating a semantically-enhanced cloud services environment through ontology evolution
2014
Currently, the availability of Web resources has grown enormously, to the point that whatever a user needs at a given moment can potentially be found on the Internet. These resources are no longer limited to data items; functionality delivered through some sort of service architectural model is also offered on the Internet. In the last few years, cloud computing has emerged as one of the most popular computing models to provide services over the Internet. However, as the number of available cloud services increases, the problem of service discovery and selection arises. Experience indicates that semantic technologies can provide the basis for enhanced and more precise search processes. In …
Information Abstraction from Crises Related Tweets Using Recurrent Neural Network
2016
Social media has become an important open communication medium during crises. The information shared about a crisis on social media is massive, complex, informal and heterogeneous, which makes extracting useful information a difficult task. This paper presents a first step towards an approach for information extraction from large Twitter data. In brief, we propose a Recurrent Neural Network based model for text generation able to produce a single text capturing the general consensus of a large collection of Twitter messages. The generated text is able to capture information about different crises from tens of thousands of tweets, summarized in a text of only 2,000 characters.
Cueing animations: Dynamic signaling aids information extraction and comprehension
2013
The effectiveness of animations containing two novel forms of animation cueing that target relations between event units rather than individual entities was compared with that of animations containing conventional entity-based cueing or no cues. These relational event unit cues (progressive path and local coordinated cues) were specifically designed to support key learning processes posited by the Animation Processing Model (Lowe & Boucheix, 2008). Four groups of undergraduates (N = 84) studied a user-controllable animation of a piano mechanism and then were assessed for mental model quality (via a written comprehension test) and knowledge of the mechanism’s dynamics (via a novel non-verbal …
2021
Strength training exercises are essential for rehabilitation, for improving health, and in sports. For optimal and safe training, educators and trainers in the industry should comprehend exercise form or technique. Currently, there is a lack of tools for measuring the in-depth skills of strength training experts. In this study, we investigate how data mining methods can be used to identify novel and useful skill patterns from a binary multiple-choice questionnaire test designed to measure the knowledge level of strength training experts. A skill test assessing exercise technique expertise and comprehension was answered by 507 fitness professionals with varying backgrounds. A triangulated appr…
Sectors on sectors (SonS): A new hierarchical clustering visualization tool
2011
Clustering techniques have been widely applied to extract information from high-dimensional data structures in the last few years. Graphs are especially relevant for clustering, but many graphs associated with hierarchical clustering do not give any information about the values of the centroids' attributes and the relationships among them. In this paper, we propose a new visualization approach for hierarchical cluster analysis in which the above-mentioned information is available. The method is based on pie charts. The pie charts are divided into several pie segments or sectors corresponding to each cluster. The radius of each pie segment is proportional to the number of patterns included i…
Dynamic integration of multiple data mining techniques in a knowledge discovery management system
1999
One of the most important directions in the improvement of data mining and knowledge discovery is the integration of multiple classification techniques in an ensemble of classifiers. An integration technique should be able to estimate and select the most appropriate component classifiers from the ensemble. We present two variations of an advanced dynamic integration technique with two distance metrics. The technique is one variation of the stacked generalization method, with the assumption that each of the component classifiers is the best one inside a certain subarea of the entire domain area. Our technique includes two phases: the learning phase and the application phase. During the learnin…